Adaptive Dynamics for Interacting Markovian Processes 
The dynamics of information flow in adaptively interacting stochastic
processes is studied. We give an extended form of game dynamics for
Markovian processes and study its behavior to observe information flow
through the system. Examples of the adaptive dynamics for two stochastic
processes interacting through a matching pennies game are exhibited
along with the underlying causal structure.
When studying the interaction and evolution of many stochastic
processes that are endowed with the ability to adapt to their
environment, a natural question arises: how does information flow
through the system and, moreover, how can we measure or calculate this
information flow? From the viewpoint of large networks of stochastic
elements, flow of information in the network has been studied
[l|, 2, S B|. In general, mutual information is not a representative
measure of information flow in adaptive dynamics, as its causal
structure forms a complex network, making the concept of information
flow unclear. To address this problem, we give an extended form of game
dynamics for interacting Markovian processes and investigate
information flow quantitatively.

Suppose that N stochastic processes $X_1, \ldots, X_N$ are interacting
with each other. At each time step $\tau$, the unit n sends a symbol
$s_n \in \{0, 1\}$ to the other units and receives at most $N - 1$
symbols from the other units. We denote the global system state as
$s = s_1 \cdots s_N$. The next symbol sent by the unit n, $s'_n$, is
dependent on the symbols received from the previous global state, s.
Local transition probabilities for the n-th unit are described as
tion probabilities $(x^{(n)}_{s'_n|s})$ denoted by $\Delta$

We introduce a local adaptation process to change the transition
probabilities $x^{(n)}_{s'_n|s}$, assuming that adaptation is very slow
compared with the relaxation time of the global Markovian process.
After the system reaches a stationary state, each unit independently
changes its stochastic structure by changing its transition
probabilities. Assuming strong connectivity of the global Markovian
kernel, we study the dynamics of the transition probabilities in an
ergodic subspace. This assumption corresponds to persistence of the
dynamics of the transition probabilities $x^{(n)}_{s'_n|s}$ in the
state space. The time evolution of $x^{(n)}_{s'_n|s}$ is driven by
simple stochastic learning through interaction: reinforcements for the
transition probabilities of the unit n to send 0 and 1 in the previous
global state s are given by constants $a^{(n)}$. The conditional
expectation reinforcements $R^{(n)}_{s'_n|s}$ for choosing each symbol
$s'_n$ given the previous state s are calculated from $a^{(n)}$,
$x^{(n)}_{s'_n|s}$, and the unique stationary distribution. For $X_n$,
we give adaptive dynamics for the probabilities of $s'_n$ given s over
the interval from $t$ to $t + \Delta t$
where $\beta^{(n)}$ is the learning rate for the unit n. Here
$\Delta t$ is much larger than the relaxation time of the global
Markovian process. The continuous-time model is given as
where $R^{(n)}_{s} = \sum_{s'_n} x^{(n)}_{s'_n|s} R^{(n)}_{s'_n|s}$ is
the conditional expectation of reinforcements over all possible symbols
given the previous system state s. Intuitively, when
$R^{(n)}_{s'_n|s}(t) - R^{(n)}_{s}(t)$ is positive, that is, the
conditional expectation reinforcement for a symbol $s'_n$ given s is
greater than the average of the expectation reinforcement given s, the
logarithmic derivative of $x^{(n)}_{s'_n|s}(t)$ increases, and when
negative, it decreases. The learning rate $\beta^{(n)}$ controls the
time scale of the adaptive dynamics of each unit n. (See [B| for the
derivation of this model.) Note that Eq. (2) represents adaptive
dynamics with finite memories. Higher dimensional coupled ODEs are
required for multiple Markovian processes, and PDEs for non-Markovian
processes with infinitely long memories.

Suppose that two biased coin tossing processes X and Y adaptively
interact with each other. They produce a pair of symbols ij at each
time step, where i and j are either heads (0) or tails (1). At the next
time step, X sends a symbol i' to Y based on the previous pair of
symbols ij, and vice versa. If there is a causal interaction with
one-step memory, the global stochastic process becomes a simple
Markovian process. When X's and Y's behaviors are causally separated,
the whole system is a product of two biased coin tossing processes
(case 10 in Fig. 1).



FIG. 1: Possible causal structures (cases 1-10): $X \to Y$ indicates
that Y receives symbols sent by X (information flow from X to Y).
Dashed arrows indicate ignorance of received symbols (no information
flow).

Considering Fig. 1, the extreme cases are 1 and 10. Case 1 corresponds
to the situation "each unit has one-step memory of the previous global
state s," and case 10 to "no information of s." Local transition
probabilities of X and Y are given as
$(x_{i'|ij}) = P(X' = i' | X = i, Y = j)$ and
$(y_{j'|ij}) = P(Y' = j' | X = i, Y = j)$, where
$\sum_{i'} x_{i'|ij} = \sum_{j'} y_{j'|ij} = 1$. The global Markovian
kernel is given by $(x_{i'|ij}\, y_{j'|ij})$, where
$\sum_{i',j'} x_{i'|ij}\, y_{j'|ij} = 1$. When X and Y match heads (0)
or tails (1) of coins (00 or 11), Y reinforces the choice, and when
they don't (01 or 10), X reinforces the choice. This interaction is
called the matching pennies game in game theory. The reinforcements are
given by a bi-matrix

where $0 < \epsilon_X, \epsilon_Y < 1$. The interaction matrices,
$A = (a_{ij})$ and $B = (b_{ji})$, are the reinforcements for X and Y
for the global state ij. The Nash equilibrium of the game (3) in terms
of game theory is a uniformly random state (1/2, 1/2). The conditional
expectation reinforcements are given by
$R^X_{i'|ij} = (A y_{|ij})_{i'}$ and $R^Y_{j'|ij} = (B x_{|ij})_{j'}$,
where $x_{|ij} = (x_{0|ij}, x_{1|ij})^T$ and
$y_{|ij} = (y_{0|ij}, y_{1|ij})^T$. Eq. (2) reduces to

$$\frac{\dot{x}_{i'|ij}}{x_{i'|ij}} = \beta_X \left[ (A y_{|ij})_{i'} - x_{|ij} \cdot A y_{|ij} \right],$$
$$\frac{\dot{y}_{j'|ij}}{y_{j'|ij}} = \beta_Y \left[ (B x_{|ij})_{j'} - y_{|ij} \cdot B x_{|ij} \right]. \qquad (4)$$

Eq. (4) corresponds to adaptive dynamics for interacting Markovian
processes in an 8-dimensional state space
$\prod_{ij} \Delta^X_{ij} \times \Delta^Y_{ij}$, which is in the form
of standard game dynamics. Similarly, for case 10, we have
where $x_{|**} = (x_{0|**}, x_{1|**})^T$ and
$y_{|**} = (y_{0|**}, y_{1|**})^T$. Here, the * indicates ignorance of
received symbols. Eq. (5) is, again, standard game dynamics in a
2-dimensional state space $\Delta_X \times \Delta_Y$. It is known that
the dynamics of Eq. (5) is Hamiltonian with a constant of motion
$H = \frac{1}{\beta_X} D(x^* \| x) + \frac{1}{\beta_Y} D(y^* \| y)$,
where D is the Kullback divergence and $(x^*, y^*)$ is the Nash
equilibrium of the game (A, B). The dynamics consist of neutrally
stable periodic orbits for the whole range of parameters
$\epsilon_X, \epsilon_Y$ [(J 0]. When the number of degrees of freedom
of the Hamiltonian system is more than 2, and the bi-matrix (A, B)
gives asymmetric cyclical interaction, the dynamics can be chaotic
[a, [8|, [9j. Summarizing, if all units have complete information of
the previous global state s (case 1), or they are all causally
separated with no information of s (case 10), we have a family of
standard game dynamics given by Eqs. (4) and (5).
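
The conservation of H in the causally separated case can be checked
numerically. The following is a minimal sketch, not the authors' code:
the matrices A and B are hypothetical stand-ins for the bi-matrix (X is
reinforced on a mismatch, Y on a match, with the other entries set to
epsilon), the learning rates are set to 1 as an assumption, and a
fourth-order Runge-Kutta integrator is used for the standard game
dynamics $\dot{x}_i / x_i = \beta_X [(Ay)_i - x \cdot Ay]$ and its
counterpart for y.

```python
import math

EPS_X = EPS_Y = 0.5      # parameter values used in the examples of the text
BETA_X = BETA_Y = 1.0    # hypothetical learning rates (assumed equal)

# Hypothetical reinforcement matrices (assumption, not taken from the paper):
# X is reinforced on a mismatch, Y on a match.
A = [[EPS_X, 1.0], [1.0, EPS_X]]   # a_ij: reinforcement for X in state ij
B = [[1.0, EPS_Y], [EPS_Y, 1.0]]   # b_ji: reinforcement for Y in state ij

def velocity(x0, y0):
    """Time derivatives of x0 = P(X sends 0) and y0 = P(Y sends 0)."""
    x, y = (x0, 1.0 - x0), (y0, 1.0 - y0)
    ay = [sum(A[i][j] * y[j] for j in range(2)) for i in range(2)]
    bx = [sum(B[j][i] * x[i] for i in range(2)) for j in range(2)]
    mean_x = sum(x[i] * ay[i] for i in range(2))   # x . Ay
    mean_y = sum(y[j] * bx[j] for j in range(2))   # y . Bx
    return (BETA_X * x0 * (ay[0] - mean_x), BETA_Y * y0 * (bx[0] - mean_y))

def rk4_step(x0, y0, dt):
    """One classical Runge-Kutta step of size dt."""
    k1 = velocity(x0, y0)
    k2 = velocity(x0 + 0.5 * dt * k1[0], y0 + 0.5 * dt * k1[1])
    k3 = velocity(x0 + 0.5 * dt * k2[0], y0 + 0.5 * dt * k2[1])
    k4 = velocity(x0 + dt * k3[0], y0 + dt * k3[1])
    return (x0 + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6,
            y0 + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6)

def kl_from_uniform(q0):
    """Kullback divergence D(x* || x) for x* = (1/2, 1/2), x = (q0, 1 - q0)."""
    return 0.5 * math.log(0.5 / q0) + 0.5 * math.log(0.5 / (1.0 - q0))

def hamiltonian(x0, y0):
    return kl_from_uniform(x0) / BETA_X + kl_from_uniform(y0) / BETA_Y

x0, y0 = 0.8, 0.3                      # arbitrary interior initial condition
h_start = hamiltonian(x0, y0)
for _ in range(20_000):                # integrate up to t = 200
    x0, y0 = rk4_step(x0, y0, 0.01)
h_end = hamiltonian(x0, y0)
print(h_start, h_end)                  # the two values should agree closely
```

Under these assumptions the orbit stays in the interior of the state
space and H is conserved up to integration error, which is consistent
with the neutrally stable periodic orbits described above.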

For intermediate cases 2-9, shown in Fig. 1, where units have partial
information of s, we have explicit stationary distribution terms in the
adaptive dynamics. Assuming the process is ergodic,
$0 < x_{i'|ij}, y_{j'|ij} < 1$, a unique stationary distribution
$(p(i,j))$ exists. We denote the marginal stationary distributions
$p^X = (P(X = 0), P(X = 1))^T$ and $p^Y = (P(Y = 0), P(Y = 1))^T$. The
conditional stationary distribution of i, given the previous state j,
is denoted as $p(i|j) = p(i,j)/p(j)$, and that of j, given the previous
state i, as $p(j|i) = p(i,j)/p(i)$.



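For case 2, with $R^X_{i'|ij} = (A y_{|i*})_{i'}$ and
$R^Y_{j'|i*} = \sum_j p(j|i) (B x_{|ij})_{j'}$, Eq. (2) reduces to

$$\frac{\dot{x}_{i'|ij}}{x_{i'|ij}} = \beta_X \left[ (A y_{|i*})_{i'} - x_{|ij} \cdot A y_{|i*} \right], \qquad (6)$$
$$\frac{\dot{y}_{j'|i*}}{y_{j'|i*}} = \beta_Y \left[ \Big( \sum_j p(j|i) B x_{|ij} \Big)_{j'} - y_{|i*} \cdot \Big( \sum_j p(j|i) B x_{|ij} \Big) \right].$$

Similarly, for case 5, with $R^X_{i'|*j} = (A p^X)_{i'}$ and
$R^Y_{j'|i*} = (B p^Y)_{j'}$, we obtain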
Note that $(p(i,j))$ are given as a function of $(x_{i'|ij})$ and
$(y_{j'|ij})$, thus the equations of motion are in closed form. For
cases 2-9, we have nonlinear couplings through a stationary
distribution, in contrast to the quasi-linear coupling of standard game
dynamics. Eqs. (6)-(7) are both in an extended form of standard game
dynamics.
FIG. 2: (Top) Case 1: Neutrally stable quasi-periodic tori. (Middle)
Case 2: A combination of quasi-periodic tori and transients to a
heteroclinic cycle. (Bottom) Case 5: (a) Transients to a heteroclinic
cycle which consists of vertex saddles
$(x_{0|*0}, x_{0|*1}, y_{0|0*}, y_{0|1*}) = (0,0,0,0), (1,1,0,0),
(0,0,1,1), (1,1,1,1)$ and (b) convergence to one of infinitely many
neutrally stable fixed points which gives a uniform stationary
distribution $p(i,j) = 1/4$ (in this case, converging to
$(x_{0|*0}, x_{0|*1}, y_{0|0*}, y_{0|1*}) = (0.539057, 0.460943,
0.671772, 0.328228)$) are attracting sets.



Let us now consider several examples. In these examples, where the
parameters are fixed to $\beta_X = \beta_Y$ and
$\epsilon_X = \epsilon_Y = 0.5$, we have four types of dynamics: (1)
neutrally stable periodic motion of the Markovian kernel, (2)
convergence to a fixed Markovian kernel that gives a uniform stationary
distribution, (3) sharp switching among almost deterministic Markovian
kernels, (4) a combination of (1)-(3). In contrast to the matching
pennies game dynamics, which shows only neutrally stable periodic
orbits, we obtain new types of dynamics naturally given by the
Markovian structure.

Case 1 (Eq. (4)): Neutrally stable quasi-periodic tori are observed.
They are simply a product of periodic orbits in the matching pennies
game dynamics. The dynamics of Eq. (5) is embedded in a subspace of the
state space, given by
$x_{i'|00} = x_{i'|01} = x_{i'|10} = x_{i'|11}$ and
$y_{j'|00} = y_{j'|01} = y_{j'|10} = y_{j'|11}$.

Case 2 (Eq. (6)): A combination of the dynamics of Eq. (4),
quasi-periodic tori, and the dynamics of Eq. (7), transients to a
heteroclinic cycle, is observed (Fig. 2, middle). One of the infinitely
many attracting periodic orbits corresponding to periodic orbits in
Eq. (5) is selected depending on initial conditions.

Case 5 (Eq. (7)): Bi-stable dynamics is observed. A manifold which
gives the uniform stationary distribution $p(i,j) = 1/4$ is an
attracting set. Fixed points on this attracting manifold are all
neutrally stable. Heteroclinic cycles which consist of several vertex
saddles are also attracting sets. Depending on initial conditions,
either convergence to one of the fixed points on the attracting
manifold or transients to one of the heteroclinic cycles are observed
(Fig. 2, bottom).

In standard game dynamics, which describes causally separated
stochastic processes, information flow is always 0. By using the
Markovian extension of game dynamics, we can now quantify
bi-directional information flow between stochastic units. Eq. (8)
gives the conditional mutual information of Y and X' given X and Y,
which is a measure of stochastic dependence of X' and Y (sometimes
called transfer entropy, see 13, II, 12, HI). Recently, a new measure
of information flow, which describes deviation of two random variables
from causal dependence, was formulated by Ay and Polani (ij.
Information flow from Y to X', given X and Y, is defined by Eq. (9) as
a measure of causal dependence.

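$$I(Y : X' | X, Y) = \sum_{i,j} p(i,j) \sum_{i'} x_{i'|ij} \log \frac{x_{i'|ij}}{\sum_{\tilde{j}} p(\tilde{j}|i)\, x_{i'|i\tilde{j}}}$$
$$= -\sum_{i} p(i) \sum_{i'} \Big[ \sum_{j} p(j|i)\, x_{i'|ij} \Big] \log \Big[ \sum_{j} p(j|i)\, x_{i'|ij} \Big] + \sum_{i,j} p(i,j) \sum_{i'} x_{i'|ij} \log x_{i'|ij}, \qquad (8)$$

$$I(Y \to X' | X, Y) = -\sum_{i} p(i) \sum_{i'} \Big[ \sum_{j} p(j)\, x_{i'|ij} \Big] \log \Big[ \sum_{j} p(j)\, x_{i'|ij} \Big] + \sum_{i,j} p(i)\, p(j) \sum_{i'} x_{i'|ij} \log x_{i'|ij}. \qquad (9)$$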
In the case that Y is a fixed information source,
$(y_{0|00}, y_{0|01}, y_{0|10}, y_{0|11}) = (1, 0, 1, 0)$, the dynamics
of Eq. (4) with $\beta_Y = 0$ monotonically converges to an optimal
$(x_{0|00}, x_{0|01}, x_{0|10}, x_{0|11}) = (0, 1, 0, 1)$. The system
state s is either 00 or 11 and X is always rewarded. In this case,



$$I(X : Y' | X, Y) = 0, \qquad I(X \to Y' | X, Y) = 0,$$
$$I(Y : X' | X, Y) = 0, \qquad I(Y \to X' | X, Y) = \log 2. \qquad (10)$$



There is information flow from Y to X because X receives symbols sent
by Y and extracts information from Y's behavior. Thus, X is not
stochastically dependent on Y but is causally dependent on Y. The
measure defined by Eq. (9) clearly captures this property. Thus,
intuitively, we can say that $I(Y \to X' | X, Y)$ is a more appropriate
measure of the information flow.
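
The contrast between the two measures in this example can be reproduced
with a short calculation. This is a sketch under stated assumptions: the
function names are hypothetical, Eq. (8) is read as the conditional
mutual information weighted with $p(j|i)$ and Eq. (9) as the same
entropy difference weighted with the product of marginals $p(i) p(j)$,
and the stationary distribution is supplied by hand with probability
1/2 on each of the states 00 and 11 mentioned in the text. Natural
logarithms are used.

```python
import math

def plogp(v):
    """Return v * log(v), with the convention 0 * log(0) = 0."""
    return v * math.log(v) if v > 0.0 else 0.0

def dist(q0):
    """Binary distribution (P(send 0), P(send 1))."""
    return (q0, 1.0 - q0)

def conditional_mutual_information(p, x0):
    """Assumed reading of Eq. (8): dependence of X' on Y, weighted by p(j|i)."""
    p_i = [p[(i, 0)] + p[(i, 1)] for i in (0, 1)]
    total = 0.0
    for i in (0, 1):
        if p_i[i] == 0.0:
            continue                      # unvisited i contributes nothing
        for i2 in (0, 1):
            mixed = sum(p[(i, j)] / p_i[i] * dist(x0[(i, j)])[i2]
                        for j in (0, 1))
            total -= p_i[i] * plogp(mixed)
            total += sum(p[(i, j)] * plogp(dist(x0[(i, j)])[i2])
                         for j in (0, 1))
    return total

def information_flow(p, x0):
    """Assumed reading of Eq. (9): same form, weighted by p(i) * p(j)."""
    p_i = [p[(i, 0)] + p[(i, 1)] for i in (0, 1)]
    p_j = [p[(0, j)] + p[(1, j)] for j in (0, 1)]
    total = 0.0
    for i in (0, 1):
        for i2 in (0, 1):
            mixed = sum(p_j[j] * dist(x0[(i, j)])[i2] for j in (0, 1))
            total -= p_i[i] * plogp(mixed)
            total += sum(p_i[i] * p_j[j] * plogp(dist(x0[(i, j)])[i2])
                         for j in (0, 1))
    return total

# Stationary distribution assumed by hand: states 00 and 11, equal weight.
p = {(0, 0): 0.5, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 0.5}
# Optimal response of X from the text: (x_0|00, x_0|01, x_0|10, x_0|11).
x0 = {(0, 0): 0.0, (0, 1): 1.0, (1, 0): 0.0, (1, 1): 1.0}

cmi = conditional_mutual_information(p, x0)
flow = information_flow(p, x0)
print(cmi, flow, math.log(2))
```

Under these assumptions the conditional mutual information vanishes,
because X and Y are perfectly correlated in the stationary state, while
the flow measure equals log 2, in line with Eq. (10).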

As shown in Fig. 3, we observe (case 1) aperiodic, (case 2) periodic
switching among aperiodic, and (case 5) stationary information flow. In
general, information flow vanishes when the system state is on a
manifold $M_0$ defined by $x_{i'|ij} = \sum_j p(j)\, x_{i'|ij}$ and
$y_{j'|ij} = \sum_i p(i)\, y_{j'|ij}$. Information flow is maximized to
log 2 when the system state is on a manifold $M_1$ defined by the set
of points which have maximal distance from $M_0$. Case 5, with
bi-stability between a fixed point and a heteroclinic cycle, gives us a
clear example of stationary information flow. Between the manifolds
$M_0$ and $M_1$ we have dynamic flow of information such as those in
cases 1 and 2 in Fig. 3. Through adaptation, dynamic information flow
emerges by keeping rewards as large as possible at each moment, and
because of the complex game interaction and underlying causal
structure.
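
As a numerical illustration of the statements above, the following
sketch evaluates the flow measure at the case 5 fixed point listed in
the Fig. 2 caption and on a point of $M_0$. It is written under
assumptions: the flow is taken to be the entropy of the mixed response
minus the average entropy of the individual responses, with the
marginals entering as a product, and the helper names are hypothetical.
The stationary distribution at the fixed point is uniform, so both
marginals are (1/2, 1/2).

```python
import math

def entropy2(q0):
    """Entropy (in nats) of the binary distribution (q0, 1 - q0)."""
    return -sum(v * math.log(v) for v in (q0, 1.0 - q0) if v > 0.0)

def information_flow(p_own, p_src, send0):
    """Flow from a source unit into a receiving unit.

    send0[own][src] is the receiver's probability of sending 0 given its
    own previous symbol `own` and the source's previous symbol `src`.
    """
    total = 0.0
    for own in (0, 1):
        mixed = sum(p_src[src] * send0[own][src] for src in (0, 1))
        average = sum(p_src[src] * entropy2(send0[own][src])
                      for src in (0, 1))
        total += p_own[own] * (entropy2(mixed) - average)
    return total

uniform = (0.5, 0.5)

# Case 5 fixed point: X responds only to j, Y responds only to i.
x_send0 = [[0.539057, 0.460943], [0.539057, 0.460943]]   # x_{0|*j}
y_send0 = [[0.671772, 0.328228], [0.671772, 0.328228]]   # y_{0|i*}

flow_y_to_x = information_flow(uniform, uniform, x_send0)
flow_x_to_y = information_flow(uniform, uniform, y_send0)
print(flow_y_to_x, flow_x_to_y)

# A point on M_0: the response does not depend on the source symbol.
flow_on_m0 = information_flow(uniform, uniform, [[0.3, 0.3], [0.8, 0.8]])
print(flow_on_m0)
```

With these inputs the two flows come out close to the stationary values
0.00305395 and 0.06023025 quoted in the Fig. 3 caption, the second one
being the flow from X to Y' under this reading, and the flow on $M_0$
is zero.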

The above is an extension of game dynamics for interacting Markovian
processes. If all units have complete information of the previous
global state s, or they are all causally separated with no information
of s, we have a family of standard game dynamics. For intermediate
cases with partial information of s, we have explicit stationary
distribution terms in the equations of motion. The presented examples
show new types of phenomena in contrast to standard game dynamics. The
dynamics of information flow between two units is discussed based on
the underlying causal structure. When units are ternary information
sources, the presented game dynamics shows chaotic behavior even in the
simplest case, Eq. (5) [alalS!. Studying adaptive dynamics for N units
with heterogeneous game interaction and with various types of causal
networks is left for future work. Rigorous information theoretic
analysis of the presented adaptive dynamics will be covered elsewhere.
The relationship between global and individual reward structure and
information flow among units would give us new insights in game theory.
Applications to ecological and social dynamics, econophysics, and
studies on learning in games are all straightforward.
sions. N. Ay thanks the Santa Fe Institute for support.
FIG. 3: (Top) Case 1: Aperiodic information flow. (Middle) Case 2:
Periodic switching among aperiodic information flow. (Bottom) Case 5:
(a) stationary information flow
$I(Y \to X' | X, Y) = I(X \to Y' | X, Y) = 0$. (b) stationary
information flow $I(Y \to X' | X, Y) = 0.00305395$ and
$I(X \to Y' | X, Y) = 0.06023025$.